Comparing Document Segmentation Strategies for Passage Retrieval in Question Answering

Author

  • Jörg Tiedemann
Abstract

Information retrieval (IR) techniques are used in question answering (QA) to retrieve passages from large document collections which are relevant to answering given natural language questions. In this paper we investigate the impact of document segmentation approaches on the retrieval performance of the IR component in our Dutch QA system. In particular we compare segmentations into discourse-based passages and window-based passages with either fixed sizes or variable sizes. We also look at the effect of overlapping passages and sliding window approaches. Finally, we evaluate the different strategies by applying them to our question answering system in order to see the impact of passage retrieval on the overall QA accuracy.
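
The window-based segmentation mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation. The function name `window_passages`, the `size` and `step` parameters, and the choice of sentences as the window unit are all assumptions made for illustration; the abstract does not state which unit or sizes were used. With `step` equal to `size` the windows are fixed-size and non-overlapping; with `step` smaller than `size` they overlap, which gives a sliding window.

```python
from typing import List


def window_passages(sentences: List[str], size: int, step: int) -> List[List[str]]:
    """Split a document, given as a list of sentences, into windows.

    Hypothetical helper for illustration only.
    step == size -> non-overlapping fixed-size passages
    step <  size -> overlapping (sliding-window) passages
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    passages: List[List[str]] = []
    for start in range(0, len(sentences), step):
        passages.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # this window already reaches the end of the document
    return passages


doc = ["s1", "s2", "s3", "s4", "s5"]
fixed = window_passages(doc, size=2, step=2)    # non-overlapping windows
sliding = window_passages(doc, size=3, step=1)  # overlapping windows
```

In this sketch the final window may be shorter than `size` when the document length is not a multiple of `step`. Whether to keep or merge such a remainder is a design choice that the abstract does not address.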


Similar resources

Boosting Passage Retrieval through Reuse in Question Answering

Question Answering (QA) is an emerging important field in Information Retrieval. In a QA system the archive of previous questions asked from the system makes a collection full of useful factual nuggets. This paper makes an initial attempt to investigate the reuse of facts contained in the archive of previous questions to help and gain performance in answering future related factoid questions. I...


Simple is Best: Experiments with Different Document Segmentation Strategies for Passage Retrieval

Passage retrieval is used in QA to filter large document collections in order to find text units relevant for answering given questions. In our QA system we apply standard IR techniques and index-time passaging in the retrieval component. In this paper we investigate several ways of dividing documents into passages. In particular we look at semantically motivated approaches (using coreference c...


Segmentation Strategies for Passage Retrieval from Internet Video using Speech Transcripts

We compare the effect of different segmentation strategies for passage retrieval of user generated internet video. We consider retrieval of passages for rather abstract and complex queries that go beyond finding a certain object or constellation of objects in the visual channel. Hence the retrieval methods have to rely heavily on the recognized speech. Passage retrieval has mainly been studied ...


Interactive Cross-Language Question Answering: Searching Passages versus Searching Documents

iCLEF 2004 is the first comparative evaluation of interactive Cross-Language Question Answering systems. The UNED group has participated in this task comparing two strategies to help users in the answer finding task: the baseline system is just a standard document retrieval engine searching machine-translated versions of the documents; the contrastive system is identical, but searching passages...


A Method of Passage-Based Document Retrieval in Question Answering System

We propose a method for using the scoring values of passages to effectively retrieve documents in a Question Answering system. For this, we suggest an evaluation function that considers the proximity between question terms in a passage. Using this evaluation function, we extract a document which involves scoring values in the highest collection, as a suitable document for the question. The propos...




Publication date: 2007